Query routing for Web search engines: architecture and experiments

Authors

  • Atsushi Sugiura
  • Oren Etzioni
Abstract

General-purpose search engines such as AltaVista and Lycos are notorious for returning irrelevant results in response to user queries. Consequently, thousands of specialized, topic-specific search engines (from VacationSpot.com to KidsHealth.org) have proliferated on the Web. Typically, topic-specific engines return far better results for “on topic” queries than standard Web search engines do. However, it is difficult for the casual user to identify the appropriate specialized engine for any given search. It is more natural for a user to issue queries at a particular Web site and have these queries automatically routed to the appropriate search engine(s). This paper describes an automatic query routing system called Q-Pilot. Q-Pilot has an off-line component that creates an approximate model of each specialized search engine’s topic. On line, Q-Pilot attempts to dynamically route each user query to the appropriate specialized search engines. In our experiments, Q-Pilot was able to identify the appropriate query category 70% of the time. In addition, Q-Pilot picked the best search engine for the query as one of its top three picks, out of its repository of 144 engines, about 40% of the time. This paper reports on Q-Pilot’s architecture, the query expansion and clustering algorithms it relies on, and the results of our preliminary experiments.
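
The two-phase design described in the abstract (an off-line topic model per engine, then on-line routing of each query) can be illustrated with a minimal sketch. This is not Q-Pilot's actual method: the abstract says Q-Pilot relies on query expansion and clustering, while the sketch below swaps in plain bag-of-words cosine similarity. The function names, the topic descriptions in `ENGINE_DOCS`, and the top-`k` cutoff are all assumptions made for illustration; only the two engine names come from the abstract.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and keep purely alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]


def build_models(engine_docs):
    # Off-line step: one term-frequency vector per engine.
    return {name: Counter(tokenize(text)) for name, text in engine_docs.items()}


def cosine(a, b):
    # Cosine similarity between two term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(query, models, k=3):
    # On-line step: rank engines by similarity to the query and
    # return up to k engine names with a non-zero score.
    q = Counter(tokenize(query))
    scored = sorted(
        ((cosine(q, model), name) for name, model in models.items()),
        key=lambda pair: (-pair[0], pair[1]),
    )
    return [name for score, name in scored[:k] if score > 0]


# Hypothetical topic descriptions (assumed, not taken from the paper).
ENGINE_DOCS = {
    "VacationSpot.com": "vacation rental beach hotel travel resort",
    "KidsHealth.org": "children health fever vaccine pediatric",
}
MODELS = build_models(ENGINE_DOCS)
```

Calling `route("beach vacation rental", MODELS)` ranks the engines against the query and drops any engine that shares no terms with it. A query with no matching terms returns an empty list, which a real system would presumably hand to a general-purpose engine.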

Similar resources

Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology

Due to the growth of the Web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...

A new model for phrase search based on weighted minimum displacement

Finding high-quality web pages is one of the most important tasks of search engines. The relevance between the documents found and the query searched depends on the user observation and increases the complexity of ranking algorithms. The other issue is that users often explore just the first 10 to 20 results while millions of pages related to a query may exist. So search engines have to use sui...

Site-To-Site (S2S) Searching with Query Routing Using Distributed Registrars

Site-To-Site (S2S) searching is a novel Web information retrieval method which uses a peer-to-peer framework with CGI as the protocol. It helps site owners to turn their websites into autonomous search engines without extra hardware and software costs. Thus, it improves on some shortcomings of Conventional Search Engines (CSE), such as centralized and outdated indexing, by distributing search engines over...

A New Hybrid Method for Web Pages Ranking in Search Engines

There are many algorithms for optimizing search engine results; ranking takes place according to one or more parameters such as backward links, forward links, content, and click-through rate. The quality and performance of these algorithms depend on the listed parameters. Ranking is one of the most important components of the search engine that represents the degree of the vitality...

Query expansion based on relevance feedback and latent semantic analysis

Web search engines are among the most popular tools on the Internet and are widely used by expert and novice users. Constructing an adequate query that best represents the user's information need to the search engine is an important concern of web users. Query expansion is a way to reduce this concern and increase user satisfaction. In this paper, a new method of query expa...

Journal title:
  • Computer Networks

Volume 33  Issue 

Pages  -

Publication date 2000